Lexically Constrained Knowledge Distillation for Neural Machine Translation
Authors
Abstract
Knowledge distillation is a representative approach in neural machine translation (NMT) for compressing a large model into a lightweight one. It first trains a strong teacher model and then forces a more compact student model to imitate the teacher. Although the key to successful knowledge distillation is constructing a strong teacher, even a teacher built with a state-of-the-art NMT model may remain inadequate owing to translation errors. Accordingly, the student severely degrades due to error propagation, especially regarding words important to the sentence meaning. To mitigate this degradation problem, we propose a method that uses lexical constraints as privileged information for NMT. The proposed method trains the teacher with lexical constraints, a list of words automatically extracted from the target side of the training data. We configure the constraints according to word importance and fallibility. Models trained with our method result in improved translations compared with those of the baseline on English↔German and English↔Japanese tasks, under conditions without ensemble decoding, using beam-search decoding.
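The constraint-extraction step described in the abstract can be illustrated with a toy sketch. Everything here is an assumption for illustration: the function name, the corpus, and the use of corpus rarity as a stand-in importance score. The paper's actual importance and fallibility criteria are not reproduced.

```python
from collections import Counter
import math

def extract_constraints(target_sentences, top_k=2):
    """Toy constraint extraction from the target side of training data.

    Scores each word by -log relative corpus frequency (rarer = higher),
    then keeps the top_k highest-scoring words per sentence as its
    lexical constraints. Rarity here merely stands in for the importance/
    fallibility configuration used in the actual method.
    """
    counts = Counter(w for s in target_sentences for w in s.split())
    total = sum(counts.values())
    constraints = []
    for s in target_sentences:
        words = s.split()
        # Sort words by rarity score, highest first (stable for ties).
        scored = sorted(words,
                        key=lambda w: -math.log(counts[w] / total),
                        reverse=True)
        constraints.append(scored[:top_k])
    return constraints

corpus = [
    "the cat sat on the mat",
    "the quantum compiler emitted bytecode",
    "the cat chased the laser",
]
for sent, cons in zip(corpus, extract_constraints(corpus)):
    print(sent, "->", cons)
```

In a distillation setting, such per-sentence word lists would be supplied to the teacher as privileged information during training, so the student imitates teacher outputs that are less likely to drop meaning-critical words.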
Similar references
Ensemble Distillation for Neural Machine Translation
Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network. In this work, we run experiments with different kinds of teacher networks to enhance the translation performance of a student Neural Machine Translation (NMT) network. We demonstrate techniques based on an ensemble and a best BLEU teacher network. We also show ...
Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization
Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge. In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. We represent prior knowledge sources as features in a log-linear model, wh...
Pre-Translation for Neural Machine Translation
Recently, the development of neural machine translation (NMT) has significantly improved the translation quality of automatic machine translation. While most sentences are more accurate and fluent than translations by statistical machine translation (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. This is especially the case when...
Bilingually-constrained Phrase Embeddings for Machine Translation
We propose Bilingually-constrained Recursive Auto-encoders (BRAE) to learn semantic phrase embeddings (compact vector representations for phrases), which can distinguish the phrases with different semantic meanings. The BRAE is trained in a way that minimizes the semantic distance of translation equivalents and maximizes the semantic distance of nontranslation pairs simultaneously. After traini...
Neural Name Translation Improves Neural Machine Translation
In order to control computational complexity, neural machine translation (NMT) systems convert all rare words outside the vocabulary into a single unk symbol. Previous solution (Luong et al., 2015) resorts to use multiple numbered unks to learn the correspondence between source and target rare words. However, testing words unseen in the training corpus cannot be handled by this method. And it a...
Journal
Journal title: Shizen gengo shori
Year: 2022
ISSN: 1340-7619, 2185-8314
DOI: https://doi.org/10.5715/jnlp.29.1082